Ku80 is involved in telomere maintenance but dispensable for genomic stability in Leishmania mexicana

Background Telomeres are indispensable for genome stability maintenance. They are maintained by the telomere-associated protein complex, which include Ku proteins and a telomerase among others. Here, we investigated a role of Ku80 in Leishmania mexicana. Leishmania is a genus of parasitic protists of the family Trypanosomatidae causing a vector-born disease called leishmaniasis. Methodology/Principal findings We used the previously established CRISPR/Cas9 system to mediate ablation of Ku80- and Ku70-encoding genes in L. mexicana. Complete knock-outs of both genes were confirmed by Southern blotting, whole-genome Illumina sequencing, and RT-qPCR. Resulting telomeric phenotypes were subsequently investigated using Southern blotting detection of terminal restriction fragments. The genome integrity in the Ku80- deficient cells was further investigated by whole-genome sequencing. Our work revealed that telomeres in the ΔKu80 L. mexicana are elongated compared to those of the wild type. This is a surprising finding considering that in another model trypanosomatid, Trypanosoma brucei, they are shortened upon ablation of the same gene. A telomere elongation phenotype has been documented in other species and associated with a presence of telomerase-independent alternative telomere lengthening pathway. Our results also showed that Ku80 appears to be not involved in genome stability maintenance in L. mexicana. Conclusion/Significance Ablation of the Ku proteins in L. mexicana triggers telomere elongation, but does not have an adverse impact on genome integrity.


Introduction
Leishmania is a genus of parasitic protists of the family Trypanosomatidae [1,2]. It causes a vector-born disease called leishmaniasis, which affects vertebrates, including humans [3]. A total of 350 million people is at risk of infection with over 1 million new cases and 20,000-30,000 deaths being documented annually [4]. It is manifested in three different clinical forms-cutaneous, mucosal, and visceral. The majority of the visceral cases are fatal, if left untreated [5].
Ku70 and Ku80 are abundant and conserved DNA binding proteins remarkable for their versatile function [6]. They are best known for their essential role in maintaining genome integrity via non-homologous end joining (NHEJ) repair pathway, one of the two most commonly used pathways to repair double-stranded DNA breaks (DSBs) [7]. Being part of NHEJ, Ku proteins have far-reaching effects on such processes as V(D)J recombination and class switch recombination [8,9]. In NHEJ, the Ku protein complex is involved in the very initial step recognizing DSBs and binding there in the sequence-independent manner to protect it from the nucleolytic degradation. It later recruits a catalytic subunit of the DNA-dependent protein kinase and other proteins facilitating formation of the repair complex [10,11]. The NHEJ pathway is fairly conserved in all the studied species, yet there is a group of organisms that tend to lose it-parasites [12]. It is generally accepted that iconic parasites, trypanosomatids, do not employ NHEJ, but rather rely on sequence microhomology, to repair DSBs in the Ku heterodimer-independent manner [13]. This notion is further substantiated by the fact that ligase IV (an enzyme involved in the classical NHEJ) is absent from trypanosomatid genomes [14]. Nevertheless, both Ku70 and Ku80 are retained by virtually all (with just one exception) trypanosomatids, arguing that they may be used in other biochemical pathways. That sole exception is Blastocrithidia spp., which lack both Ku70 and Ku80 [15]. Notably, the genomes of these species have accumulated numerous insertions in protein-coding genes raising a question, whether such a level of genome instability may be associated with the loss of Ku proteins [12].
Apart from their indispensable role in DSB repair, the Ku proteins are also involved in other cellular processes, such as transcription, DNA damage response, DNA replication, and telomere maintenance, to name just a few [11,16]. At the chromosomal termini, they form complexes with other telomere-associated proteins protecting the DNA ends from being recognized and processed as DSBs [17] and preventing inappropriate recombination events involving telomeric repeats [18]. Surprisingly, the Ku proteins deficiency has manifested in rather ambiguous telomeric phenotypes in different species [19][20][21][22][23][24]. In a model trypanosomatid species, Trypanosoma brucei, complete ablation of the Ku80 protein resulted in gradual shortening of the telomeric repeats, a phenotype that was rescued by the expression of an ectopic copy of the gene [20]. The Ku protein knock-outs have been shown to influence the length of telomeres and trigger the alternative telomere lengthening (ALT) pathways that are not active under normal circumstances [25,26]. The canonical way of dealing with the "end replication problem" (an intrinsic inability of DNA polymerases to accomplish complete lagging strand synthesis of linear DNA templates leading to telomere shortening) [27,28] relies on engagement of a ribonucleoprotein enzyme, telomerase (TERT) [29]. The ALT pathways are TERT-independent and involve such molecular mechanisms as break-induced replication, replication using extrachromosomal DNA or a t-loop elongation [30,31]. Telomeres maintained by ALT are typically long and heterogeneous.
In this paper, we show that ablation of the LmxM. 29.0340 (gene encoding Ku80 in L. mexicana) has resulted in the opposite telomeric phenotype, as compared to that in T. brucei. The telomers of L. mexicana get elongated upon deletion of Ku80. This phenotype was stable over at least 100 passages in vitro and was reversed back to the wild type by expression of the ectopic copy of Ku80. We also demonstrate that genome of L. mexicana ΔKu80 does not display instability traits, suggesting that Ku80 is not involved in its maintenance.
Throughout the long-term experiment, WT and ΔKu80 cell cultures were passaged 2 times a week for a total of 100 passages. Cells were sampled from passage numbers 0, 25, 50, 75, and 100.
Growth kinetics in vitro was analyzed as described previously [36]. The statistical significance was evaluated using unpaired t test in Prism v. 8.0.1 (GraphPad Software, San Diego, USA).

LmxM.08_29.1050 (Ku70) in Leishmania mexicana
The CRISPR-Cas9 L. mexicana strain was established using the plasmid pTB007 [37]. To ablate LmxM. 29.0340 and LmxM.08_29.1050, guide RNAs (gRNAs) with a 20 nt seed sequence targeting upstream and downstream regions of the genes of interest were amplified using corresponding 5´gRNA and 3´gRNA primers (hereafter all primer sequences are listed in S1 Table). Donor DNA was PCR amplified from the pTNeo_v1 plasmid [38] [41], and the region corresponding to the gene of interest was visually inspected. The genome was also assembled de novo by SPAdes genome assembler v. 3.13.0 [42] and visualized with Tablet v. 1.19.09.03 [43].
To confirm complete LmxM. 29.0340 ablation in L. mexicana we also employed Southern blotting as previously described [44] using ApaI-digested total genomic DNA from the midlog phase grown cells.
Finally, the expression of LmxM. 29.0340 and LmxM.08_29.1050 was examined by RT-qPCR. Total RNA was isolated and transcript levels of the proteins of interest in L. mexicana were measured as described previously [45,46] in three biological and technical replicates. Expression values were normalized to those of the 18S rRNA.

Ectopic expression of Ku80 in L. mexicana ΔKu80
The modified pLEXSY_IE-egfp-red-neo4 (Jena Bioscience GmbH, Jena, Germany) vector was used to express an ectopic copy of LmxM. 29.0340. Firstly, the neomycin resistance gene of the pLEXSY_IE-egfp-red-neo4 was replaced by a gene encoding streptothricin acetyltransferase (Sat, allowing selection with Nourseothricin) to generate pLEXSY_IE-egfp-red-sat. The Sat ORF was amplified from the plasmid pLEXSY-Sat2 (Jena Bioscience) and cloned into the pLEXSY_IE-egfp-red-neo4 with BamHI and SpeI (both from Thermo Fisher Scientific, Waltham, USA). The LmxM. 29.0340 ORF was amplified from the L. mexicana total genomic DNA and cloned into pLEXSY_IE-egfp-red-sat with BglII and NotI (Thermo Fisher Scientific), replacing EGFP-DsRed fusion gene. The resultant pLEXSY_IE-Ku80-sat plasmid was used for ectopic expression of LmxM. 29.0340.
The L. mexicana ΔKu80 promastigotes were grown in the complete M199 medium supplemented with 100 μg/ml Hygromycin B and Neomycin G418 Sulfate. The mid-log phase cells were transfected with 5 μg of pLEXSY_IE-Ku80-sat plasmid as described above. Positive transfectants were selected on complete M199 medium supplemented with 100 μg/ml Hygromycin B and 50-100 μg/ml Nourseothricin (Jena Bioscience). Ectopic expression of LmxM. 29.0340 was confirmed by RT-qPCR and Southern blotting as described above. The resulting strain is hereafter referred to as L. mexicana Ku80_add.

De novo sequencing and genome assembly
Leishmania mexicana (isolate MNYC/BZ/62/M379) genome was also assembled de novo with Flye v. 2.8.3 [47] using only PacBio reads reported previously [48] and genome size parameter 32 Mb, which is the size of reference L. mexicana MHOM/GT/2001/U1103 assembly [49] from the TriTrypDB release 52 [50]. This has been done in order to assemble longer contigs needed for genome instability analyses. The PacBio read N 50 and average genome coverage estimated by Flye were 9,022 bp and 130×, respectively. The initial Flye assembly with contig N 50 of 1,043 kbp and 73 total number of fragments was polished twice (Racon polishing) with TGS-GapCloser v. 1.1.1 [51], and then with Pilon v. 1.23 [52] using 30 million of high-quality Illumina read pairs (read length 150 bp). At this stage, the assembled scaffolds were broken into contigs, which were re-scaffolded with RaGOO v. 1.1 [53] using TriTrypDB L. mexicana assembly as a reference. The resulting assembly was again treated with TGS-GapCloser and no genome rearrangements (relative to L. mexicana MHOM/GT/2001/U1103) were detected at this step. The final assembly was quality-checked using QUAST v. 5.0.2 [54] and BUSCO v. 5.0.0 [55] tools.

Genome instability analysis
Leishmania mexicana genomes (ΔKu80 and wild type: 0, 25, 50, 75, 100 passages; 10 samples in total) were sequenced using Illumina NovaSeq platform at the Institute of Applied Biotechnology. Sequencing reads data obtained in frame of this work are deposed into NCBI SRA under BioProject PRJNA746247. Random subsamples of~17-18 million of read pairs were prepared with a custom Python script from each sequenced sample after their quality check and adapter trimming with Trimmomatic v. 0.39 [39]. Each subsample was mapped on the genome assembly using the BWA mem v. 0.7.17 [56] and alignments were processed with SAMtools v. 1.9 [57]. The SNP and short indel calling was performed with the Genome Analysis Toolkit (GATK) v. 4.2 using the 'HaplotypeCaller' tool [58] with default settings and with '-ploidy' set to 10 to capture possible variants with low frequency, as possible SNPs or indels caused by the Ku80 ablation could potentially be restricted to a fraction of cells only. The SNP and indel sets were compared using a custom Python script. Chimeric and split alignments, secondary alignments and read alignments with soft-clipped bases were counted in SAM files with custom shell scripts, using specific SAM tags and alignment CIGAR strings. General read mapping analyses were focused on counting the fractions of reads with specific properties, which can point to various recombination or mutation processes in genomes: number of unmapped, soft-clipped, chimeric alignments. SA:Z and XA:Z tags were used by BWA short read aligner for chimeric/split read alignments. Each subsample's sorted BAM file was analyzed for assembly identity using the ALE tool [59].

Plotting chimeric reads density
As a control dataset, a sample of L. mexicana M379, sequenced by The Wellcome Trust Sanger Institute in 2013 and deposed in NCBI SRA under accession ERR307335, was downloaded and processed as above. Reads were mapped on the M379 genome assembly with BWA mem v. 0.7.17. Chromosome sequence was binned in 24 kbps-long fragments, bin number was assigned to each mapped read or read fragment. Read counts were stored in 2D-matrix in such a way that the row in this matrix corresponds to the bin number, assigned to the first read in pair (or the 'leftmost' mapped read fragment) and the column is the bin number assigned to the second read in read pair (or 'rightmost' mapped read fragment). The read pairs mapped close to each other were counted in cells near diagonal of such matrix, while chimeric reads of putative "translocations" were counted in cells far from diagonal. Read counts in the matrix were first normalized over the total number of mapped reads, then diagonal elements of the matrices were replaced with zeros and all other values were scaled to (0, 1) range. As the entries of matrices are symmetrical across main diagonal, matrices for two different samples were joined.

Southern blotting and quantification of transcripts encoding telomereassociated proteins using RT-qPCR
Telomere lengths were analyzed by the terminal restriction fragment length analysis as in [15,20]. For the loading and integrity control, the membranes were stripped and re-probed against a fragment of an 18S rRNA gene. Statistics of the telomere lengths were obtained with an online tool WALTER (Web-based Analyser of the Length of TElomeRes) [60].
Transcripts encoding telomere-associated proteins were quantified by RT-qPCR in five biological replicates as in [15].

Establishment of the ΔKu80 (LmxM.29.0340 -/-) and add-back L. mexicana lines
In order to investigate function of the Ku proteins in Leishmania biology, we first ablated the Ku80-encoding gene (LmxM.29.0340) in L. mexicana. The clonal cultures, deficient in both LmxM. 29.0340 alleles (L. mexicana ΔKu80), were obtained using CRISPR/Cas9 system [38] (Fig 1A). The complete knock-out was confirmed by Southern blotting (Fig 1B), wholegenome Illumina sequencing (Fig 1C), and RT-qPCR (Fig 1D). In addition, we also established the add-back cultures, in which Ku80 was overexpressed episomally on the LmxM.29.0340-null background (Fig 1A) and verified it by Southern blotting (Fig 1B) and RT-qPCR (Fig 1D). Of note, the expression level of LmxM. 29.0340 in the add-back cells was higher compared to that of the wild type, but these numbers were not as dramatically different as often observed in the add-back experiments [61,62].
To exclude the clonal bias in data interpretation, we have selected four and three random clones of ΔKu80 and Ku80_add L. mexicana, respectively. As an additional control, we have also established a clonal line with ablated gene LmxM.08_29.1050 encoding Ku70 (S1 Fig).

Telomeres in L. mexicana ΔKu80 are elongated and this phenotype is stable over 100 passages
Leishmania mexicana is a model species with extremely short telomeres, as compared to other trypanosomatids [15]. Contrary to our expectations (based on the previously reported studies from another model trypanosomatid T. brucei [20]), the L. mexicana ΔKu80 telomeres were considerably extended upon LmxM. 29.0340 ablation (median telomere length 389 and 564 bp for the wild type and ΔKu80 cells, respectively). This phenotype was reversed in the add-back cells (median telomere length 431 bp), confirming that the observed effect is not an artefact of the genetic manipulations (Fig 1E and S2 Table). Of note, the results were similar in all analyzed randomly selected ΔKu80 and add-back clonal lines. Remarkably, the ablation of another Ku protein, Ku70, also manifested in elongated telomeres (S2 Fig and S2 Table).
We also investigated whether the observed telomeric phenotype is stable and confirmed that it is not altered for over 100 passages in culture (Fig 2). The L. mexicana cells with ablated Ku80 divide slightly slower compared to their wild type counterparts; this effect is reversed in the add-back lines (S3 Fig).

Expression patterns of some telomere-associated proteins and Ku80 correlate in L. mexicana
We have recently shown that the majority of proteins associated with telomeres in T. brucei are conserved across Trypanosomatidae [15]. To investigate the potential correlation between expression of the LmxM. 29.0340 and other genes encoding proteins implicated in telomere   29.0340 (Fig 3,  boxed). Compared to the wild type, their expression was statistically significant down and up regulated in the ΔKu80 and Ku80_add L. mexicana, respectively. Some notable entries on this list are TERT, ttaggg binding factor (TRF), and repressor activator protein 1 (RAP1), which confirm previous observations of their interactions with Ku proteins in other species [21,22,25,[63][64][65]. This is not a genome-wide phenomenon, as some of the analyzed genes do not follow the same pattern (Fig 3).

Deeper sequencing and genome instability analyses
In order to investigate possible influence of the Ku80 ablation on genome stability, we first performed high-quality chromosome-level genome assembly of L. mexicana MNYC/BZ/62/M379 (hereafter referred as "M379 assembly") and compared it to that of the L. mexicana MHOM/ GT/2001/U1103 assembly from the TriTrypDB ("reference assembly"). The final M379 assembly has 34 nuclear chromosomes (31,72 Mbps in total) along with a completely assembled circular contig of the mitochondrial maxicircle. The QUAST report indicated that 97.8% of the Illumina reads were mapped back onto the assembly, 99.99% of bases had non-zero coverage, and 99.98% of the reference L. mexicana assembly was covered by the M379 assembly contigs. The M379 assembly had averaged 10 Ns per 100 kb. In total, 594 sequence discrepancies were detected between the M379 and the reference L. mexicana assemblies. Most of these variants are rather short (under 1 kb) insertions or deletions (400 cases) or tandem repeat expansions/ contractions (96 cases); both types can be frequently observed in trypanosomatid genomes. No miss-assemblies were detected by Pilon, TGS-GapCloser, or read mapping of either long Pac-Bio or paired-end Illumina reads. The BUSCO completeness of the M379 assembly for 'eukaryota' dataset is 53.8% (137 out of 255 complete BUSCOs). This is the value comparable to the best available chromosome-level assemblies for trypanosomatids (Leishmania tarentolae: 53.8%, L. mexicana: 53.8%, L. major: 53.8%, Leptomonas pyrrhocoris: 55.3%, T. brucei: 55.7%). Altogether, this indicates that vast majority of the L. mexicana MNYC/BZ/62/M379 genome was properly assembled.
Next, we investigated genome stability by mapping the paired-end Illumina reads (150 bp) obtained from the WT and ΔKu80 L. mexicana collected after 0, 25, 50, 75 and 100 passages in vitro. Read mappings were analyzed with various tools that detect SNPs, short indels, longer structural variants indicated by the chimeric read alignments or misaligned read pairs, locus coverage variations, and overall likelihood of the assembly (ALE method). As genomic variations, caused by malfunction of the DNA repair system, can arise randomly in cells, the 'allele frequency' of each individual variant can be rather low. We accounted for that in the GATK haplotype calling by adjusting the ploidy parameter. We also analyzed the possible variants present at extremely low frequencies by counting overall number of mis-aligned, chimeric or soft-clipped reads in each sample (these read alignments indicate the sequence difference between the read and the reference and can capture events supported even by a single read sequence). The Wilcoxon signed rank test was used to conduct pairwise comparisons between the WT and ΔKu80 samples and no differences were found in any test (p-value < 0.05). Results and statistics of the analyses are summarized in the S3 Table. This is illustrated in the Fig 4 for chromosome 20 of L. mexicana. In both cell lines (ΔKu80 and WT), the observed frequency of putative translocations slightly increases over time (compare panels A and B), but this increasement is similar. Notably, we detected a higher translocation frequency when we compared genomic data for the same strain produced in 2013 and 2021 (panel C). This can be easily explained by the plasticity of Leishmania genomes-recombination events accumulate over time [66][67][68].
To conclude, we did not detect any statistically significant difference between the WT and ΔKu80 samples in any performed test. These results imply that ablation of LmxM. 29.0340 does not result in genome instability; although they do not exclude a possibility that the effect of this knock-out is delayed and manifests itself after more rounds of cell division.

Discussion
In this work we investigated the role of Ku80 in Leishmania mexicana biology and demonstrated that its ablation affects telomere length. Unlike situation in another model trypanosomatid species, Trypanosoma brucei [20], a complete knock-out of the Ku80 encoding gene (LmxM. 29.0340) resulted in telomere elongation (Fig 1). Of note, and, again, different from T. brucei, elongated telomeres in L. mexicana are stable for at least 100 passages in culture (Fig 2). A similar phenotype (telomere elongation upon Ku protein ablation) was previously observed in C. albicans and A. thaliana [21,22]. In these species, telomere elongation in the Ku deficient mutants was linked to alternative TERT-independent ways of telomere maintaining [21,25,69]. Notably, alternative mechanisms of telomer maintenance have been recently documented in Leishmania spp. [70]. The Ku 80 protein in L. mexicana appears to be central for the network of factors involved in telomere maintenance, as its expression at RNA level correlates with that of 13 out of 19 genes previously implicated in this process (Fig 3). On the other hand, we did not corroborate a recently proposed hypothesis that Ku proteins might direct genome stability in trypanosomatids [12].
Supporting information S1